The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based, and of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once, which was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and ensembling was performed by only 50%, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
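Of the reported coping strategies for large data samples, patch-based training is the simplest to illustrate. The following NumPy sketch shows one generic way to sample fixed-size 3D patches from a volume that is too large to process at once; the array shapes, patch size, and function name are illustrative assumptions rather than details taken from any specific challenge solution.

```python
import numpy as np

def sample_patches(volume, patch_size=(64, 64, 64), n_patches=8, rng=None):
    """Randomly crop fixed-size 3D patches from a large volume.

    Minimal sketch of patch-based training data preparation; patch size and
    count are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    pz, py, px = patch_size
    Z, Y, X = volume.shape
    patches = []
    for _ in range(n_patches):
        z = rng.integers(0, Z - pz + 1)
        y = rng.integers(0, Y - py + 1)
        x = rng.integers(0, X - px + 1)
        patches.append(volume[z:z + pz, y:y + py, x:x + px])
    return np.stack(patches)

# Example: a CT-like volume that would not fit in GPU memory at full size.
volume = np.random.rand(512, 512, 512).astype(np.float32)
batch = sample_patches(volume)   # shape (8, 64, 64, 64)
```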
Temporal action localization (TAL) aims to detect the boundaries and identify the class of each action instance in a long untrimmed video. Current approaches treat video frames homogeneously and tend to give excessive attention to the background and key objects, which limits their sensitivity in localizing action boundaries. To this end, we propose a prior-enhanced temporal action localization method (PETAL), which only takes RGB input and incorporates action subjects as priors. PETAL leverages the action subjects' information with a plug-and-play subject-aware spatial attention module (SA-SAM) to generate an aggregated and subject-prioritized representation. Experimental results on the THUMOS-14 and ActivityNet-1.3 datasets demonstrate that the proposed PETAL achieves competitive performance using only RGB features, e.g., boosting mAP by 2.41% over the state-of-the-art approach that uses RGB features and by 0.25% over the approach that additionally uses optical flow features on the THUMOS-14 dataset.
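The abstract does not detail the SA-SAM; as a rough illustration of a plug-and-play spatial attention module biased by a subject prior, here is a minimal PyTorch sketch. The module name, layer choices, and tensor shapes are assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class SubjectAwareSpatialAttention(nn.Module):
    """Illustrative spatial attention conditioned on a subject-prior mask.

    A generic sketch, not the SA-SAM described in the paper: the attention
    map is predicted from the features and biased by a per-pixel prior that
    marks likely action subjects.
    """
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats, subject_prior):
        # feats: (B, C, H, W); subject_prior: (B, 1, H, W) in [0, 1]
        attn = torch.sigmoid(self.score(feats) + subject_prior)
        return feats * attn   # subject-prioritized representation

feats = torch.randn(2, 256, 14, 14)
prior = torch.rand(2, 1, 14, 14)
out = SubjectAwareSpatialAttention(256)(feats, prior)
```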
Deep learning technology has made great progress in multi-view 3D reconstruction tasks. At present, most mainstream solutions establish the mapping between views and the shape of an object by assembling a 2D encoder and a 3D decoder as the basic structure, while adopting different approaches to aggregate the features from several views. Among them, methods using attention-based fusion perform better and more stably than the others; however, they still have an obvious shortcoming: each view is treated independently when predicting the fusion weights, so the predictions cannot adapt to the global state. In this paper, we propose a global-aware attention-based fusion approach that builds the correlation between each branch and the global representation to provide a comprehensive foundation for weight inference. To further strengthen the network, we introduce a novel loss function to supervise the overall shape and propose a dynamic two-stage training strategy that can effectively adapt to all reconstructors with attention-based fusion. Experiments on ShapeNet verify that our method outperforms existing SOTA methods while using far fewer parameters than the same type of algorithm, Pix2Vox++. Furthermore, we propose a view-reduction method based on maximizing diversity and discuss the cost-performance tradeoff of our model to achieve better performance when facing a large number of input views under a limited computational budget.
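To make the notion of global-aware attention fusion concrete, the sketch below scores each view against a global summary (here, the mean over views) before softmax weighting, in contrast to scoring each branch independently. This is a generic interpretation with assumed dimensions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GlobalAwareFusion(nn.Module):
    """Sketch of attention-based multi-view fusion with a global context.

    Each view's score depends on both its own features and a global summary,
    so the fusion weights are no longer predicted independently per branch.
    Dimensions are illustrative assumptions.
    """
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, view_feats):
        # view_feats: (B, V, D) -- one feature vector per view
        global_ctx = view_feats.mean(dim=1, keepdim=True).expand_as(view_feats)
        scores = self.score(torch.cat([view_feats, global_ctx], dim=-1))  # (B, V, 1)
        weights = torch.softmax(scores, dim=1)
        return (weights * view_feats).sum(dim=1)                          # (B, D)

fused = GlobalAwareFusion(128)(torch.randn(4, 5, 128))
```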
Despite the high global prevalence of hepatic steatosis, no automated diagnostic tool has demonstrated generalizability in detecting steatosis across multiple international datasets. Traditionally, hepatic steatosis detection relies on clinicians selecting a region of interest (ROI) on computed tomography (CT) to measure liver attenuation. ROI selection demands time and expertise and is therefore not routinely performed at the population level. To automate the process, we validated an existing artificial intelligence (AI) system for 3D liver segmentation and used it to propose a novel method, AI-ROI, which automatically selects the ROI for attenuation measurements. The AI segmentation and the AI-ROI method were evaluated on 1,014 non-contrast-enhanced chest CT images from eight international datasets: LIDC-IDRI, NSCLC-Lung1, RIDER, VESSEL12, RICORD-1A, RICORD-1B, COVID-19-Italy, and COVID-19-China. The AI segmentation achieved a mean Dice coefficient of 0.957. Attenuations measured by AI-ROI showed no significant differences from expert measurements (p = 0.545) while requiring 71% less time. The area under the curve (AUC) of the AI-ROI steatosis classification was 0.921 (95% CI: 0.883 - 0.959). If performed as a routine screening method, our AI protocol could potentially enable early non-invasive, non-pharmacological preventative interventions for hepatic steatosis. The 1,014 expert-annotated liver segmentations of patients with hepatic steatosis annotations can be downloaded here: https://drive.google.com/drive/folders/1-g_zJeAaZXYXGqL1OeF6pUjr6KB0igJX.
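The core measurement is easy to illustrate: average the CT attenuation (in HU) inside a liver mask and compare it against a cut-off. The NumPy sketch below is a minimal version of that idea; the 40 HU threshold and the synthetic data are assumptions for illustration and are not necessarily the operating point of the AI-ROI classifier.

```python
import numpy as np

def liver_attenuation(ct_hu, liver_mask):
    """Mean attenuation (in HU) over the segmented liver."""
    return float(ct_hu[liver_mask].mean())

def flag_steatosis(mean_hu, threshold_hu=40.0):
    """Illustrative rule: lower attenuation suggests a fattier liver.

    The 40 HU threshold is an assumption for this sketch, not necessarily
    the cut-off used by the AI-ROI method.
    """
    return mean_hu < threshold_hu

# Synthetic stand-in for a CT volume and a liver segmentation mask.
ct_hu = np.random.normal(loc=50, scale=15, size=(64, 128, 128))
liver_mask = np.zeros_like(ct_hu, dtype=bool)
liver_mask[20:40, 40:90, 40:90] = True
print(flag_steatosis(liver_attenuation(ct_hu, liver_mask)))
```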
Automatically measuring lesion/tumor size, RECIST (Response Evaluation Criteria In Solid Tumors) diameters, and segmentation is important for computer-aided diagnosis. Although it has been studied in recent years, there is still room to improve its accuracy and robustness, e.g., by (1) enhancing features with rich contextual information while maintaining high spatial resolution and (2) involving new tasks and losses for joint optimization. To achieve this, this paper proposes a transformer-based network (MeaFormer, Measurement Transformer) for lesion RECIST diameter prediction and segmentation (LRDPS). The problem is formulated as three correlated and complementary tasks: lesion segmentation, heatmap prediction, and keypoint regression. To the best of our knowledge, this is the first time keypoint regression is used for RECIST diameter prediction. MeaFormer enhances high-resolution features by capturing their long-range dependencies with transformers. Two consistency losses are introduced to explicitly establish the relationships among these tasks for better optimization. Experiments show that MeaFormer achieves state-of-the-art LRDPS performance on the large-scale DeepLesion dataset and yields promising results on two downstream clinically relevant tasks, i.e., 3D lesion segmentation and RECIST assessment in longitudinal studies.
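As a small illustration of how a RECIST long-axis diameter can be read off keypoint predictions, the sketch below takes two endpoint heatmaps, locates their maxima, and converts the distance to millimetres using the pixel spacing. It is not MeaFormer's actual head design; names, shapes, and spacing values are assumptions.

```python
import numpy as np

def keypoint_from_heatmap(heatmap):
    """Return the (row, col) location of the heatmap maximum."""
    return np.unravel_index(np.argmax(heatmap), heatmap.shape)

def recist_diameter(heatmap_a, heatmap_b, spacing_mm=(1.0, 1.0)):
    """Illustrative RECIST long-axis length (mm) from two endpoint heatmaps."""
    (r1, c1) = keypoint_from_heatmap(heatmap_a)
    (r2, c2) = keypoint_from_heatmap(heatmap_b)
    dy, dx = (r1 - r2) * spacing_mm[0], (c1 - c2) * spacing_mm[1]
    return float(np.hypot(dy, dx))

# Toy heatmaps with a single activated endpoint each.
hm_a, hm_b = np.zeros((128, 128)), np.zeros((128, 128))
hm_a[40, 30], hm_b[80, 100] = 1.0, 1.0
print(recist_diameter(hm_a, hm_b, spacing_mm=(0.8, 0.8)))
```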
To address the growing demand for labeled data and the privacy concerns associated with human detection, synthetic data has been used as a substitute and has shown promising results in human detection and tracking tasks. We participated in the 7th Workshop on Benchmarking Multi-Target Tracking (BMTT), themed "How Far Can Synthetic Data Take Us?". Our solution, PieTrack, is developed from synthetic data without using any pre-trained weights. We propose a self-supervised domain adaptation method that mitigates the domain shift between synthetic data (e.g., MOTSynth) and real data (e.g., MOT17) without involving extra human labels. By leveraging the proposed multi-scale ensemble inference, we achieved a final HOTA score of 58.7 on the MOT17 test set, ranking third in the challenge.
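Multi-scale ensemble inference can be sketched generically: run a detector at several input scales and merge the rescaled detections, e.g., with non-maximum suppression. The code below assumes a hypothetical detector(image, scale) interface and a dummy detector for demonstration; it illustrates only the ensembling step, not PieTrack's implementation.

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.6):
    """Minimal NMS over boxes given as (x1, y1, x2, y2)."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter + 1e-9)
        order = order[1:][iou < iou_thr]
    return keep

def multi_scale_inference(image, detector, scales=(0.8, 1.0, 1.2)):
    """Hypothetical ensemble: detect at several scales, merge with NMS.

    `detector(image, scale)` is an assumed interface returning
    (boxes, scores) already mapped back to the original resolution.
    """
    all_boxes, all_scores = [], []
    for s in scales:
        boxes, scores = detector(image, s)
        all_boxes.append(boxes)
        all_scores.append(scores)
    boxes, scores = np.concatenate(all_boxes), np.concatenate(all_scores)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]

# Dummy detector standing in for a trained model (illustration only).
def dummy_detector(image, scale):
    return np.array([[10, 10, 50, 80]], dtype=float), np.array([0.9])

boxes, scores = multi_scale_inference(np.zeros((256, 256, 3)), dummy_detector)
```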
Reducing sensor requirements while maintaining optimal control performance is crucial for many industrial control applications in order to obtain robust, low-cost, and computationally efficient controllers. However, existing feature selection solutions designed for typical machine learning domains can hardly be applied in the control domain, where the dynamics vary. In this paper, a novel framework, Dual-world embedded Attentive Feature Selection (D-AFS), efficiently selects the most relevant sensors for the system under dynamic control. Rather than the single world used in most deep reinforcement learning (DRL) algorithms, D-AFS contains both the real world and a virtual peer with twisted features. By analyzing the DRL's response in the two worlds, D-AFS can quantitatively determine the importance of the respective features for control. The well-known active flow control problem of cylinder drag reduction is used for evaluation. Results show that D-AFS successfully finds an optimized five-probe layout with 18.7% better drag reduction than the state-of-the-art solution, also outperforming the five-probe layout designed by human experts. We further apply this solution to four OpenAI classic control cases; in all cases, D-AFS obtains the same or better sensor configurations than the originally provided solutions. We believe these results highlight a new way to achieve efficient and optimal sensor designs for experimental or industrial systems. Our source code is publicly available at https://github.com/g-yab/dafsfluid.
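A simplified way to picture the two-world idea is to compare a fixed policy's return in the real environment with its return when one sensor is replaced by noise; the larger the drop, the more control-relevant that sensor. The sketch below implements that proxy with assumed policy(obs) and env_factory() interfaces; it is not the actual D-AFS mechanism, which learns feature importance attentively during training.

```python
import numpy as np

def feature_importance(policy, env_factory, n_features, episodes=20, seed=0):
    """Illustrative 'two-world' probe, not the actual D-AFS architecture.

    For each feature, compare the average return of a fixed policy in the
    real environment against a copy in which that feature is replaced by
    noise ('twisted'). `policy(obs)` and `env_factory()` are assumed
    interfaces; env.step is assumed to return (obs, reward, done).
    """
    rng = np.random.default_rng(seed)

    def average_return(mask_feature=None):
        env = env_factory()
        total = 0.0
        for _ in range(episodes):
            obs, done = env.reset(), False
            while not done:
                if mask_feature is not None:
                    obs = obs.copy()
                    obs[mask_feature] = rng.normal()   # twist one sensor
                obs, reward, done = env.step(policy(obs))
                total += reward
        return total / episodes

    baseline = average_return()
    return np.array([baseline - average_return(i) for i in range(n_features)])
```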
Federated learning (FL) enables learning from decentralized, privacy-sensitive data, with computations on the raw data carried out at edge clients. This paper introduces mixed FL, which incorporates an additional loss term computed at the coordinating server (while maintaining FL's private data restrictions). This has numerous benefits. For example, additional datacenter data can be leveraged to jointly learn from centralized (datacenter) and decentralized (federated) training data and better match an expected inference data distribution. Mixed FL can also offload some intensive computations (e.g., regularization) to the server, greatly reducing communication and client computation load. For these and other mixed FL use cases, we present three algorithms: parallel training, 1-way gradient transfer, and 2-way gradient transfer. We state convergence bounds for each and give intuition on which is suited to particular mixed FL problems. Finally, we perform extensive experiments on three tasks, showing that mixed FL can blend training data to achieve accuracy on the inference distribution and can reduce communication and computation overhead by over 90%. Our experiments confirm the theoretical predictions of how the algorithms perform under different mixed FL problem settings.
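To make the setting concrete, the toy sketch below runs parallel-training-style rounds on a linear least-squares model: clients compute gradients on their private shards, the server computes a gradient on centralized data, and the two are combined. The model, the weighting, and the interfaces are assumptions for illustration, not the paper's algorithms or convergence bounds.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of mean squared error for a linear model y ~ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def mixed_fl_round(w, client_data, server_data, lr=0.1, server_weight=0.5):
    """One illustrative mixed-FL round (assumed form, not the paper's).

    Clients compute gradients on their private shards; the coordinating
    server adds a loss term on centralized data; both updates are combined.
    """
    client_grad = np.mean([grad_mse(w, X, y) for X, y in client_data], axis=0)
    server_grad = grad_mse(w, *server_data)
    return w - lr * ((1 - server_weight) * client_grad + server_weight * server_grad)

rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0])

def make_shard(n):
    X = rng.normal(size=(n, 2))
    return X, X @ w_true + 0.1 * rng.normal(size=n)

clients = [make_shard(50) for _ in range(4)]   # decentralized (federated) shards
server = make_shard(200)                       # centralized (datacenter) data
w = np.zeros(2)
for _ in range(200):
    w = mixed_fl_round(w, clients, server)
print(w)   # should approach w_true in this toy setup
```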
This paper proposes a novel Pixel Interval Down-sampling Network (PID-Net) for dense tiny object counting tasks with higher accuracy. PID-Net is an end-to-end convolutional neural network (CNN) model with an encoder-decoder architecture. The pixel interval down-sampling operations are connected with max-pooling operations to combine sparse and dense features, which addresses the limitation of contour conglutination of dense objects during counting. Evaluation was performed using classical segmentation metrics (Dice, Jaccard, and Hausdorff distance) as well as counting metrics. Experimental results show that the proposed PID-Net has the best performance and potential for dense tiny object counting tasks, achieving a counting accuracy of 96.97% on a dataset of 2448 yeast cell images. Compared with state-of-the-art methods such as Attention U-Net, Swin U-Net, and Trans U-Net, the proposed PID-Net segments dense tiny objects with clearer boundaries and less incorrect debris, indicating its great potential for accurate counting tasks.
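The pixel interval down-sampling idea can be illustrated in a few lines: sample the feature map at a stride of 2 in four phase-shifted ways (so no pixel is discarded) and concatenate the result with a max-pooled copy. The PyTorch sketch below is a generic reading of that operation, not the exact PID-Net block.

```python
import torch
import torch.nn as nn

class PixelIntervalDownsample(nn.Module):
    """Sketch of pixel-interval down-sampling combined with max-pooling.

    Interval sampling keeps every pixel as four phase-shifted half-resolution
    maps (preserving sparse detail), while max-pooling keeps the dominant
    responses; the two are concatenated along channels. A generic reading of
    the operation, not the exact PID-Net block.
    """
    def forward(self, x):   # x: (B, C, H, W), H and W even
        interval = torch.cat(
            [x[:, :, 0::2, 0::2], x[:, :, 0::2, 1::2],
             x[:, :, 1::2, 0::2], x[:, :, 1::2, 1::2]], dim=1)   # (B, 4C, H/2, W/2)
        pooled = nn.functional.max_pool2d(x, kernel_size=2)       # (B,  C, H/2, W/2)
        return torch.cat([interval, pooled], dim=1)               # (B, 5C, H/2, W/2)

out = PixelIntervalDownsample()(torch.randn(1, 16, 64, 64))
print(out.shape)   # torch.Size([1, 80, 32, 32])
```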
Recent image inpainting methods have made great progress but often struggle to generate plausible image structures when dealing with large holes in complex images. This is partly due to the lack of effective network structures that can capture the long-range dependencies and high-level semantics of an image. We propose cascaded modulation GAN (CM-GAN), a new network design consisting of an encoder with Fourier convolution blocks that extracts multi-scale feature representations from the input image with holes, and a dual-stream decoder with a novel cascaded global-spatial modulation block at each scale. In each decoder block, global modulation is first applied to perform coarse and semantics-aware structure synthesis, followed by spatial modulation that further adjusts the feature maps in a spatially adaptive fashion. In addition, we design an object-aware training scheme to prevent the network from hallucinating objects inside holes, meeting the needs of object removal tasks in real-world scenarios. Extensive experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations. Please see the project page: https://github.com/htzheng/cm-gan-inpainting.
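The cascade of global followed by spatial modulation can be sketched as follows: a global code produces channel-wise scale/shift parameters, after which a spatial tensor produces per-pixel scale/shift parameters. The PyTorch block below illustrates only this cascading idea; the layer choices and shapes are assumptions, not the CM-GAN design.

```python
import torch
import torch.nn as nn

class CascadedModulationBlock(nn.Module):
    """Generic sketch of a decoder block with global then spatial modulation.

    A global code first yields channel-wise scale/shift (coarse, semantics-
    aware), then a spatial map yields per-pixel scale/shift (spatially
    adaptive refinement). Illustrative only; not the CM-GAN block.
    """
    def __init__(self, channels, global_dim):
        super().__init__()
        self.to_global = nn.Linear(global_dim, 2 * channels)
        self.to_spatial = nn.Conv2d(channels, 2 * channels, kernel_size=3, padding=1)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feats, global_code, skip):
        # feats: (B, C, H, W); global_code: (B, G); skip: encoder features (B, C, H, W)
        g_scale, g_shift = self.to_global(global_code).chunk(2, dim=1)
        x = feats * (1 + g_scale[..., None, None]) + g_shift[..., None, None]
        s_scale, s_shift = self.to_spatial(skip).chunk(2, dim=1)
        x = x * (1 + s_scale) + s_shift
        return torch.relu(self.conv(x))

block = CascadedModulationBlock(channels=64, global_dim=128)
out = block(torch.randn(2, 64, 32, 32), torch.randn(2, 128), torch.randn(2, 64, 32, 32))
```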